DOCUMENT RESUME

ED 251 506                                                  TM 840 803

AUTHOR          Bachelor, Patricia; Buchanan, Aaron
TITLE           Using Cluster Analysis to Solve Real Problems in
                Schooling and Instruction.
INSTITUTION     Southwest Regional Laboratory for Educational
                Research and Development, Los Alamitos, Calif.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
REPORT NO       SWRL-TR-85
PUB DATE        17 May 84
CONTRACT        NEC-00-3-0064
NOTE            44p.
PUB TYPE        Reports - Research/Technical (143)

EDRS PRICE      MF01/PC02 Plus Postage.
DESCRIPTORS     *Cluster Analysis; Cluster Grouping; Computer
                Software; Correctional Education; *Evaluation
                Methods; *Instructional Improvement; *Mathematics
                Skills; Quantitative Tests; *Remedial Mathematics;
                Research Design; Secondary Education; Statistical
                Studies; *Student Evaluation
IDENTIFIERS     Evaluation Problems; Pre Algebra; Training and
                Employment Prerequisites Survey

ABSTRACT

This report describes cluster analysis methodology
and illustrates its potential merits for educational research by
describing a study designed to identify the natural subgroups
existing among students beginning a secondary level remedial
mathematics course. Cluster analysis forms groups of relatively
homogeneous subjects represented in large data blocks that contain
several different observations for each subject. While recently
developed software packages make the computation more manageable,
identifying the optimal number of clusters, and making sense out of
them, requires considerable knowledge of the subjects and variables
being clustered. Cluster analysis was applied to data from the two
Prealgebra surveys of the Training and Employment Prerequisites
Survey given to almost 1,500 wards of the California Youth Authority
enrolled in remedial mathematics classes. The Prealgebra Surveys
cover the skills and concepts most common to mathematics instruction
up through grade 7. Results indicate that even fairly well prepared
students will be unsuccessful in traditional remediation programs
which focus first on mastery of all types of complicated
computational skills. Instead, general mathematics instruction should
consist of one course that redevelops basic concepts and skills for
handling fractions and decimals, and another course on more advanced
topics in general mathematics applications or an introduction to
algebra where the requirements for complicated forms of computation
are carefully controlled. (BS)



***********************************************************************
*    Reproductions supplied by EDRS are the best that can be made    *
*                     from the original document.                     *
***********************************************************************








This document has been distributed to a limited audience for a limited purpose. It is not published. Copies may be
made only with the written permission of SWRL Educational Research and Development, 4665 Lampson Avenue, Los
Alamitos, California 90720. The work upon which this document is based was performed pursuant to Contract
NE-0-00-3-0064 with the National Institute of Education. SWRL reports do not necessarily reflect the opinions or
policies of the sponsors of SWRL R&D.




SWRL EDUCATIONAL RESEARCH AND DEVELOPMENT

TECHNICAL REPORT 85
May 17, 1984

USING CLUSTER ANALYSIS TO SOLVE REAL PROBLEMS IN SCHOOLING AND 
INSTRUCTION 

Patricia Bachelor and Aaron Buchanan 

ABSTRACT 

The basic issues of the methodology of cluster analysis as applied
to educational research are discussed. The relative merits of the
technique are far-reaching and deserving of more attention on the part of
educational researchers and practitioners. An application of the
technique to school and instructional data is presented.






USING CLUSTER ANALYSIS TO SOLVE REAL PROBLEMS IN SCHOOLING AND
INSTRUCTION



Patricia Bachelor and Aaron Buchanan 



Consider that you have the opportunity to redesign
coursework in mathematics for students who have
completed seven or eight years of school (not counting
preschool or kindergarten) but have not seen a lot of
success in mathematics. Whatever else they may have
done, they have fallen seriously behind the pace of
instruction by the end of grades 7 or 8, and they are
not usually thought to be promising candidates at this
point for regular instruction in a first year-long
course in algebra. These are students who typically
flow into a ninth-grade course in general mathematics
and, if they can get by requirements for graduation with
only one year of mathematics, are unlikely to take any
more mathematics, ever.

Are these students basically alike? Should they all
take about the same course in mathematics at this point
in their instructional history? Or are there enough
differences within the group to offer two or three
different kinds of courses/options that will build upon
what, if anything, students know already? In other
words, do their performances on different kinds of
tasks, all taken from fairly straightforward things that
they should have had several opportunities to learn at
earlier grade levels, show mostly likenesses that can
all be built upon in about the same way? Or do they
suggest basic differences that would be more adequately
addressed by two or three courses/options?



Improving the delivery of instruction so that it better fits past
accomplishments and present weaknesses of students who will benefit from
it is a very practical problem, but one that is also very complicated.
The reason this problem is so complicated has everything to do with
formal tests that are regularly used as part of school instruction, with
the information that is present, but not always obvious, in test results,
and with how this information can be organized to tell us something about
how to reshape instruction intended for whole populations of students who
are at a common point in school. The main problem is how to organize
what can be huge amounts of data in ways that tell us more about the
structure of the population of students that the data represent.
Educational groups have not made much headway in solving this problem,
but researchers and practitioners in other areas of science have, and
cluster analysis is one of the tools they have used to help them.

Cluster analysis is a family of statistical procedures designed to
"create" clusters within a set of data. It is used extensively in many
areas of science to organize things according to their likenesses and
differences within large blocks of complicated data, and it has the
potential to be applied successfully to some very practical problems in
schooling and instruction. Where there are large sets of data
consisting of many observations such as test scores for individual
students, classrooms, or schools, cluster analysis has great potential to
assist in sorting out groups of students, classrooms, or schools that
appear within the data to be more alike than different. It is especially
useful when we have several observations for each student, classroom, or
school. Consider, for example, the observation of students' performances
in mathematics. These days, the tests that schools use to observe
mathematics performances provide scores for individual students on
several different mathematics objectives.¹ However, most of the
grouping that schools do for purposes of instruction is still based on a
single, overall mathematics score. The result is a "high," a "medium,"
and a "low" group based only on the overall scores of individual
students; all of the detail represented by the high performances and low
performances of individual students on different objectives is "averaged"
away. Schools don't pay much attention to different scores made by
different students on different objectives because all of these
differences are too hard to keep track of when decisions about the
grouping of students are being made. Cluster analysis could help to
simplify this problem and, at the same time, give schools more power to
respond to the diversity of needs and accomplishments residing within a
large group of students, but people who run schools, and other people who
advise them, don't know about it.

¹Sometimes, these scores are for a variety of mathematics subskills
rather than very specific mathematics objectives, but the difference has
more to do with nomenclature than with the substance of what is tested.
In current practice, "subskills" and "objectives" are both represented by
the same kinds of test items grouped within the test in about the same
way.

If cluster analysis is not a practical tool for schools to use, it
is partly because educational researchers have not provided much
leadership. Cluster analysis is simply not a procedure that's used very
much in educational research. It was not really used very widely in any
area of science prior to the time that the computer became well
established as a tool for processing huge quantities of research data.
Since then, cluster analysis has become a practical tool in taxonomic
sciences, where, for example, there is a lot of interest in identifying
different classes of flora and fauna that are precisely alike in some
respects and precisely different in others. It is also used widely in
information sciences, where researchers use a technique called
co-citation analysis to identify new fields of scientific endeavor based
on observations of the topics that are covered by articles published in
scientific journals and the references that are made in the text of these
articles to works of other authors which are published elsewhere. These
days, researchers in education could be using cluster analysis to study
not only large populations of students but also populations of schools
and classrooms, based on observations of likenesses and differences that
are collected into large data bases on a fairly routine basis, but they
don't. As a tool for making sense out of information buried in large
data bases, the kind that exist in growing abundance in local school
districts, state departments of education, and federal bureaus, cluster
analysis has yet to have much impact on education.

Several months ago, SWRL staff began to use cluster analysis to shed
more light on the structure of large populations of students representing
fairly broad units of instruction. What we have been looking at directly
is the instructional accomplishments of different students, different
classrooms, and different schools based on the proficiencies they appear
to demonstrate on formal assessments for different kinds of school
subject matter such as reading, mathematics, composition and science.

The instructional accomplishments which we observed were taken
directly from the raw percentages for different skill areas within, say,
a mathematics assessment, that are commonly reported as test results.
For an individual student, these results are the percentage of assessment
items for a particular skill area that the student answers correctly; for
classrooms and schools, the results on a particular skill area are
averages of the results obtained by individual students within a
classroom or within a school.
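
The two kinds of results described above can be sketched in a few lines of Python. This is only an illustration: the student names, skill-area names, and item responses are hypothetical, and the function names are ours rather than anything taken from the assessments.

```python
from statistics import mean

# Hypothetical item-level results: 1 = correct, 0 = incorrect.
# Skill-area names and responses are invented for illustration.
responses = {
    "student_a": {"fractions": [1, 1, 0, 1, 0], "decimals": [0, 0, 1, 0, 0]},
    "student_b": {"fractions": [1, 0, 0, 0, 0], "decimals": [1, 1, 1, 0, 1]},
}

def percent_correct(items):
    """Percentage of items in one skill area answered correctly."""
    return 100.0 * sum(items) / len(items)

def student_profile(areas):
    """Map each skill area to the student's percent correct."""
    return {area: percent_correct(items) for area, items in areas.items()}

def group_profile(profiles):
    """Average the student percentages, skill area by skill area."""
    areas = profiles[0].keys()
    return {area: mean(p[area] for p in profiles) for area in areas}

profiles = {name: student_profile(areas) for name, areas in responses.items()}
classroom = group_profile(list(profiles.values()))
```

Here `profiles` holds one row of skill-area percentages per student, and `classroom` holds the averages that would stand for a classroom or school in the same kind of analysis.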






What we wanted to find out by using cluster analysis was the
following:

To what extent do the accomplishments of students, or classrooms, or
schools form meaningful clusters that suggest that, based on their
accomplishments, we are not dealing with one large group of
students, classrooms, or schools that are all basically alike but
are faced with several smaller groups that are obviously quite
different?

We were not doing research on cluster analysis, as such. Rather, we were
applying procedures for doing cluster analysis, procedures that already
exist in packaged software that is almost universally available to
researchers. In fact, more than applying procedures for doing cluster
analysis, which are as easy as the software package is "user-friendly,"
we were grappling with the problem of interpreting results in ways that
had direct and obvious implications for shaping instruction for students
as a population. That's the hard part, because there simply hasn't been
enough use of cluster analysis on questions regarding schooling and
instruction for us to have many precedents to follow. But that's getting
ahead of the story. To fully appreciate the problems inherent in
interpretation of the results of cluster analysis, one needs to be fairly
well grounded in what cluster analysis (in this case, cluster analysis
software) is designed to do in the first place.

What Cluster Analysis Does

The general objective of cluster analysis is to partition a set of
subjects into subgroups that are as homogeneous as possible. Sometimes,
the differences between these subgroups are large and obvious and we
assume they are meaningful; other times, the differences are so small
that it would be impossible to make a case that one subgroup should be
thought of as being any different from the other. In the latter case,
the "mathematics" of cluster analysis gives us subgroups that are
different in a strictly technical sense, but, realistically, the ways in
which they are different don't appear to be worth much concern.

To the extent that cluster analysis generates subgroups that are
meaningful, a careful look at these subgroups and how they relate to one
another will allow the user to:

• Identify natural clusters within a mixture of subjects that may
represent several different populations

• Construct a useful scheme for putting subjects into different
classes

• Find out whether or not classes that are believed to be present
within a certain population are actually there

• "Snoop" within a population for unsuspected clusters

Cluster analysis forms groups or clusters of relatively homogeneous
subjects that are represented in large blocks of data that contain
several different observations (e.g., "scores") for each subject. These
groups are formed on the basis of how "close" together individual
subjects are in the data base. At the heart of cluster analysis,
"closeness" is measured mathematically in different ways depending on the
method for cluster analysis that is being used. In most software
packages closeness is measured by some form of what is known as Euclidean
distance between points in the data base or by a form of what's known as
sums of squares. Either way, these methods are designed to form clusters
so that distances between individual subjects are as small as possible
within clusters and as large as possible between clusters. The thing to
keep in mind is that methods for clustering data are designed to create
clusters whether or not any meaningful clusters actually exist. For
example, the cluster method will, at some point, partition a data base
containing, say, 40 subjects into ten clusters even though the data base
may not actually contain any cluster that's meaningful, much less 10. In
other words, the different methods for cluster analysis are designed to
grind out clusters according to some mathematical specifications that are
built into the method. It is the researcher's job to decide when there
is enough difference between clusters within a certain configuration for
the clusters themselves to be meaningful.

How Cluster Analysis Works 

What really happens is this: the data in a cluster analysis are
assembled into a matrix, or a rectangular array of numerical entries,
corresponding to the observations made on each variable for each subject.
The rows represent N subjects while the columns represent n variables,
such as test scores on different skill areas in mathematics. A complete
row of "scores" for each skill area may be considered the subject's
profile. Next, the data matrix is transformed into a square N x N matrix
of measures of distances that exist between each pair of subjects.²
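
As a minimal sketch of this transformation, the Python below turns a small data matrix into a square matrix of Euclidean distances between every pair of subjects. The profiles are invented for illustration, and Euclidean distance is only one of the measures a software package might offer.

```python
import math

# Hypothetical profiles: each row is one subject, each column one skill area
# (proportion correct). The numbers are invented for illustration.
data = [
    [0.8, 0.6, 0.2],
    [0.8, 0.6, 0.4],
    [0.2, 0.0, 1.0],
]

def euclidean(p, q):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_matrix(rows):
    """Square N x N matrix of distances between every pair of subjects."""
    n = len(rows)
    return [[euclidean(rows[i], rows[j]) for j in range(n)] for i in range(n)]

dist = distance_matrix(data)
```

The result is symmetric with zeros on the diagonal. In this made-up example the first two subjects differ on only one skill area, so they are much closer to each other than either is to the third.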



²Usually these are Euclidean distances, sums of squares, or
correlation coefficients. At this point, the researcher using a
software package has already selected the method for determining distance
available in the package that is most appropriate for the kind of data
that will be clustered. However, there does exist a large body of
measures that can be used in cluster analysis, which may be divided into
four major groups: one, distance measures, which are generally used with
continuous or ordinal data but can also be applied to binary or
qualitative data; two, association coefficients, generally used with
binary or qualitative data; three, correlation coefficients, such as
Pearson's r for continuous data or phi for binary data; four,
probabilistic similarity coefficients based on information statistics.




Actual clustering begins as soon as the matrix of distance measures
is complete. (This all takes the computer but a few seconds to do.)
There are several methods for forming clusters, but most software
packages only use the most popular ones, and these are known as
hierarchical methods.³ The hierarchical methods can be classified as
divisive or agglomerative. Agglomerative techniques, which are the most
common, start by treating all N subjects as individual clusters and then
proceed step-by-step to form new clusters from subjects that are closest
together. At each step, the two entities that are closest are combined
to form a new cluster. Sometimes this means combining two individual
subjects that are not already in a cluster; sometimes it means combining
an individual subject with a cluster that has been formed previously; and
sometimes it means combining two clusters that have been formed
previously. The result is to form bigger and bigger clusters until,
finally, there is only one huge cluster that contains all N subjects.
Divisive techniques operate in the reverse. This process starts with all
N subjects in one cluster and then divides this cluster into two
clusters, then three, and so on until there are N clusters that each
contain one subject.
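
The agglomerative procedure can be sketched as below. This is an illustration, not the algorithm inside any particular package. It assumes that the distance between two clusters is the average distance between their members, which is one common choice among several; the text above does not commit to a rule. The five profiles are hypothetical.

```python
import math
from itertools import combinations

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cluster_distance(c1, c2, rows):
    """Average pairwise distance between members (an assumed linkage rule)."""
    total = sum(euclidean(rows[i], rows[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(rows):
    """Return every partition, from N single-subject clusters down to one."""
    clusters = [(i,) for i in range(len(rows))]
    history = [list(clusters)]
    while len(clusters) > 1:
        # Find the two closest entities (subjects or clusters) and merge them.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], rows),
        )
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        history.append(list(clusters))
    return history

# Hypothetical profiles for five subjects on two skill areas.
rows = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.2], [0.5, 0.5]]
history = agglomerate(rows)
three_clusters = history[len(rows) - 3]   # the partition with three clusters
```

Each entry in `history` is one step of the hierarchy, so a 3-cluster or 4-cluster solution is read off by index rather than computed separately.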



³Clustering techniques are broadly classifiable into nonhierarchical
methods and hierarchical methods. The nonhierarchical (or single-level)
procedures are of two basic types. One technique involves iterative
partitioning of subjects into multiple clusters. Typically, some form of
optimizing criterion is applied to relocate subjects from one cluster to
another after the initial assignment. The method begins with a
predetermined number of clusters and, through various iterative
processes, tries to find a revised classification which will make the
distances from subject to subject within each cluster as small as
possible in combination with maximizing the distances between clusters.
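
A minimal sketch of iterative partitioning follows. It assumes a simple relocation rule in the style of k-means: each subject moves to the cluster whose centroid is nearest, and the process repeats until nothing moves. The footnote does not name a specific criterion, so this rule, the data, and the starting assignment are all assumptions made for illustration.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Mean profile of a non-empty list of subjects."""
    return [sum(col) / len(points) for col in zip(*points)]

def relocate(rows, assignment, k, max_iter=100):
    """Move subjects to the cluster with the nearest centroid until stable."""
    assignment = list(assignment)
    for _ in range(max_iter):
        centers = []
        for c in range(k):
            members = [r for r, a in zip(rows, assignment) if a == c]
            centers.append(centroid(members) if members else None)
        live = [c for c in range(k) if centers[c] is not None]
        new_assignment = [
            min(live, key=lambda c: sq_dist(row, centers[c])) for row in rows
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment

# Hypothetical profiles and a deliberately lopsided starting assignment.
rows = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.2]]
initial = [0, 0, 0, 1]
final = relocate(rows, initial, k=2)
```

The result depends on the starting assignment. With these same four subjects, a start of `[0, 1, 0, 1]` is already stable under this rule even though it mixes high and low performers, which is one reason such methods are usually run from several starting points.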






The use of cluster analysis involves almost no assumptions about the
underlying structure of the data base, which makes it a very desirable
instrument for many different types of applied research. However, there
are two important issues that need to be addressed early in the design of
the research. These are (a) how to know when you have an optimal number
of clusters and (b) how to interpret these clusters once you've got them.
Software packages usually provide some kind of statistic intended to
"measure" differences that exist between clusters in a particular
configuration. Whatever else goes into the decision is based on practical
considerations of how many clusters can reasonably be handled within the
interpretation, especially when the purpose of interpretation is to
generate practical implications for how to improve schooling and
instruction.

While recent advances and interest in cluster analysis have led to
software packages that make the computational drudgery that accompanies
cluster analysis more tractable, the identification of an optimal number
of clusters and making sense out of clusters in an optimal solution still
require a good deal of outside knowledge about the subjects that are
being clustered and the variables (e.g., scores on different skill areas
in mathematics) on which the clustering is based. There do exist some
guidelines for selecting the number of clusters. Unfortunately, these
guidelines are mostly tied to work in areas other than schooling and
instruction, and they don't provide much help.

Proceeding to the second phase of analysis, let's suppose that, in
looking at performances in mathematics, we suspect that three clusters
are optimal. First, we need to get some practical sense of how different
the clusters are; second, we need to provide some explanation of why
these differences occur, and, better still, what implications they have
for improving instruction. Until now, there has not been enough work
done on cluster analysis to provide us with much of any model to follow,
but we do know that there are some areas where interpretations could
easily go wrong. In our hypothetical case here, where we suspect that
three clusters are optimal, it would be easy to try to portray them as
"high," "average," and "low" clusters. The trouble with these labels is
that they are consistent with our intuitions that most instructional
groups have students at three different levels of ability, but they are
not consistent with the regularity that we know exists in the effects of
instruction over a long period of time. All students are taught about
the same things at about the same point of time. Moreover, they tend to
learn about the same things in the same order, especially in mathematics.
For several reasons, some students learn some things sooner than others,
and so we are likely to see basic differences in the things, especially
the number of things, that different subgroups of students can do.
Differences in ability undoubtedly play some role in how soon different
students can do what they've been taught, but it's only one of many
things that affect instruction, and it isn't a very useful consideration
in deciding what to teach and when to do it.

Often, in what we call a 3-cluster solution in cluster analysis,
there may be two clusters that both contain a series of high and low
performance on different skill areas. Skill areas that show high
performances for one cluster may show low performances for the other, and
vice versa. Sometimes, it may be hard to see any meaningful differences
between the two clusters, and in this case, it may be necessary to look
at a 4-cluster or 5-cluster solution to actually "see" fairly clear
differences among two or three major clusters. Generally, if we try to
interpret a solution containing too few clusters then we run the risk
that one large cluster masks differences between variables which
differentiate between the clusters of subjects. If, on the other hand,
there are too many clusters, then the interpretation becomes clouded and
differences between subjects begin to overwhelm similarities that exist
between clusters of subjects.

Some traditional statistical methods may at first seem helpful in
"testing" a decision about an optimal number of clusters, but quite
frequently these are not valid for use within the cluster analysis
methodology. For example, ordinary significance tests, such as F-tests,
are not valid for testing differences between clusters. Since clustering
methods attempt to maximize the separation between clusters, the
assumptions behind the usual significance test are drastically violated.
Also, most valid tests for clusters either have intractable sampling
distributions or involve null hypotheses for which rejection is vacuous.⁴
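
The point about F-tests can be illustrated with a small simulation. The sketch below draws 40 scores from a single uniform distribution, so there are no real subgroups. It then compares the F ratio for the best possible two-cluster split of those scores with the F ratio for an arbitrary split (odd versus even positions). The data, the seed, and the function names are hypothetical choices made for illustration.

```python
import random

def f_ratio(group_a, group_b):
    """One-way analysis-of-variance F statistic for two groups."""
    groups = (group_a, group_b)
    everyone = group_a + group_b
    grand = sum(everyone) / len(everyone)
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return between / (within / (len(everyone) - 2))

def best_two_clusters(values):
    """Split sorted values at the cut that minimizes within-cluster
    sum of squares."""
    ordered = sorted(values)

    def within(cut):
        total = 0.0
        for g in (ordered[:cut], ordered[cut:]):
            m = sum(g) / len(g)
            total += sum((x - m) ** 2 for x in g)
        return total

    cut = min(range(1, len(ordered)), key=within)
    return ordered[:cut], ordered[cut:]

random.seed(0)
scores = [random.random() for _ in range(40)]   # one population, no subgroups
f_clustered = f_ratio(*best_two_clusters(scores))
f_arbitrary = f_ratio(scores[0::2], scores[1::2])
```

Because the clusters were chosen to be as far apart as possible, `f_clustered` comes out far above the conventional critical value even though the data contain no structure, while `f_arbitrary` behaves like an ordinary F statistic. A "significant" F computed on clusters therefore says nothing about whether the clusters are meaningful.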



⁴For clustering methods based on distance matrices, a popular null
hypothesis is that all permutations of the values in the distance matrix
are equally likely. Using this null hypothesis one can do a permutation
test or a rank test. The trouble with permutation hypotheses is that
with any real data, the null hypothesis is totally implausible even if
the data do not provide any useful information.




Application of Cluster Analysis to Real School Data: An Illustration

Over the past several months, we have had several opportunities to
apply cluster analysis to real data on school accomplishments resulting
from administration of some assessment instruments that have been
developed by SWRL staff. In one case, cluster analysis was used to see
what kind of natural clusters elementary schools might fall into based on
average school performances on different objectives included in a science
assessment. Most recently, we have been using cluster analysis to look
at natural subgroups that exist among students who are just beginning a
formal remedial course in general mathematics at the secondary level,
and, so far, this has provided us with the clearest illustration of the
potential that exists for using cluster analysis to solve complicated,
but very practical, problems that go with interpretation of achievement
data so that there are clear implications for designing school
coursework.

The data for our illustration came from results of what we call the
Training and Employment Prerequisites Survey (TEPS). This survey was
given to wards who are enrolled in remedial mathematics classes provided
by the California Youth Authority as part of its education program. In
mathematics, the TEPS series consists of two surveys, Prealgebra A and
Prealgebra B, which cover the most salient skills and concepts that are
most common to grade-by-grade instruction in mathematics up through about
grade 7.

Prealgebra A represents skills and concepts that schools would cover
in regular instruction by about the middle of grade 4. This includes:

• use of whole numbers up to one hundred thousand,

• use of simple common fractions for part of a set or region,

• addition and subtraction of whole numbers,

• multiplication and division of 2-digit and 3-digit whole numbers
by a number that is ten or less,

• simple measurement skills involving length, time and money,

• recognition of simple geometric shapes,

• solution of basic types of word problems involving addition,
subtraction and multiplication.

Prealgebra B continues with:

• use of large whole numbers,

• use of equivalent fractions,

• simple relationships between fractions, decimals, and percents,

• computation with whole numbers up through multiplication and
division by 2-digit numbers,

• computation with fractions, decimals, and percent,

• measurement skills that are a little more advanced than ones
covered in Prealgebra A (but they still deal with length, time,
and money),

• basic kinds of word problems involving computation that is a
little more advanced than the computation required by word
problems in Prealgebra A.

Altogether, we had data from almost 1500 students. About two-thirds
of them took Prealgebra B, because their instructors felt that Prealgebra
A was too elementary to show a broad range of mathematical tasks that
they could or could not perform. The rest of the students took
Prealgebra A because, just prior to the time of our assessment, they had
been working on the very simplest of mathematical skills and concepts.

Cluster analysis was applied only to data from students who had been
in instruction two months or less. This distinction is important in data
coming from education programs provided at correctional institutions.
Courses do not usually have fixed beginning and ending points, as they do
in regular high schools, because students are coming and going on a
continuous schedule. Our main interest was to see what kinds of clusters
might exist among students who are entering a course in remedial
instruction, so we did not look at data from students who had been in a
mathematics course for more than two months. From the data set for
students who had only been in remedial work for a short time, we
generated several random samples, each consisting of data from about 25
students. Ten samples were generated from results on Prealgebra A and
ten from results on Prealgebra B. Five samples for each survey were used
to look for basic patterns in the kinds of clusters that seemed to occur
naturally and the other five were used to try to verify the patterns that
occurred in the first five samples. All of the data processing for the
cluster analysis was carried out using CLUSTER programs that are part of
a software package from SAS.⁵
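
The sampling design just described can be sketched as follows. This is a hedged illustration only: it assumes the samples do not overlap and that each holds exactly 25 students, neither of which is stated above, and the pool of 400 student IDs is hypothetical.

```python
import random

def draw_samples(student_ids, n_samples=10, sample_size=25, seed=0):
    """Draw non-overlapping random samples of student IDs.

    Non-overlap and the fixed sample size are assumptions of this sketch.
    """
    rng = random.Random(seed)
    pool = list(student_ids)
    rng.shuffle(pool)
    needed = n_samples * sample_size
    if len(pool) < needed:
        raise ValueError("not enough students for non-overlapping samples")
    return [pool[i:i + sample_size] for i in range(0, needed, sample_size)]

# Hypothetical pool of 400 entering students who took one of the surveys.
samples = draw_samples(range(400))
exploration, verification = samples[:5], samples[5:]
```

The first five samples would be clustered to look for basic patterns, and the other five would be clustered separately to see whether the same patterns reappear.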

Analysis of Results from Prealgebra B

The examples we are providing here come from results of processing
three of our ten random samples using CLUSTER. The first example from
sample 3 is shown in Tables 1 and 2. It includes data from 25 students
who took Prealgebra B. In the terminology of cluster analysis, the 25
students are cases and the nine skill areas on Prealgebra B, each
involving about five to seven mathematics problems, represent the
variables. The performances (percent correct) for each student on each
skill area constitute the data on which CLUSTER is performed. The results
of CLUSTER contain a lot of different kinds of information about



⁵SAS User's Guide: Statistics, 1982 Edition. SAS (Statistical
Analysis System) Inc., Box 8000, Cary, North Carolina.






Table 1

Cluster Analysis Pre-Algebra A (Sample 3)

[Tree diagram: horizontal axis, Name of Observation or Cluster (Student
Number); vertical axis, Number of Clusters, 1 to 25.]






Table 2

Cluster Analysis Pre-Algebra B (Sample 3)
Ward's Hierarchical Cluster Analysis

Simple Statistics

Skill area    MEAN      STD DEV
WHOLENO       0.62400   0.29620
FRACTION      0.68667   0.27352
DECIMALS      0.36800   0.?9257
COMPWHLE      0.82857   0.24046
COMPFRAC      0.54857   0.31047
COMPDECI      0.53500   0.23805
COMPPERC      0.63200   0.31979
MEASURE       0.60000   0.351?9
PROBLEMS      0.75200   0.31770

Cluster Means of Various Solutions by Skill Area

Solution     N    WHOLENO  FRACTION  DECIMALS  COMPWHLE  COMPFRAC  COMPDECI  COMPPERC  MEASURE  PROBLEMS

4 clusters   4    .15000   .41667    .300000   .64286    .285714   .343750   .20000    .10000   .10000
             4    .55000   .37500    .200000   .57143    .214286   .375000   .45000    .30000   .70000
             10   .88000   .90000    .540000   .97143    .857143   .712500   .82000    .86000   .94000
             7    .57143   .71429    .257143   .87755    .448980   .482143   .71429    .68571   .88571

3 clusters   8    .35000   .39583    .250000   .60714    .250000   .359375   .32500    .20000   .40000
             10   .88000   .90000    .540000   .97143    .857143   .712500   .82000    .86000   .94000
             7    .57143   .71429    .257143   .87755    .448980   .482143   .71429    .68571   .88571

2 clusters   8    .35000   .39583    .250000   .60714    .250000   .359375   .32500    .20000   .40000
             17   .75294   .82353    .423529   .93277    .689076   .617647   .77647    .78824   .91765

1 cluster    25   .62400   .68667    .368000   .82857    .548571   .535000   .63200    .60000   .75200


different cluster arrangements within the data set, as is true with most
statistical packages that are available these days. The basic components
that one would be most likely to use in interpreting CLUSTER results are
shown in Tables 1 and 2.

Table 1 shows the optimal ways to separate all 25 students into
first one cluster, then two, then three clusters, and so on. The key is
to look at the rows that start on the left side of the table. For
example, the best arrangement for two clusters includes the first eight
students in one cluster, reading across the top of the table, and the
other 17 students in the second cluster. In the 3-cluster solution, the
cluster with 17 students splits into two clusters, one with 10 students
and the other with seven, while the cluster with eight students stays
intact. Finally, at the bottom of the table, we are left with 25 clusters,
where each student is defined as an individual cluster. Looking
from the bottom up, 25 students in 25 clusters doesn't have much
significance if one is thinking about shaping different kinds of courses
to fit basic differences in past accomplishments, differences that are
buried within a set of performance data in mathematics. In real schooling
and instruction, it makes more sense to look at the top of the diagram in
Table 1 to see what kinds of differences exist between clusters when you
go from one cluster that includes all students to a maximum of four or
five clusters.

For real applications to schooling and instruction, it helps to
think of cluster analysis in reverse. Instead of forming clusters of
things that are most alike, it's easier to think about "forcing" clusters
to split. You begin with one cluster that includes all 25 students, and
then "force" it to split into two, ostensibly at the weakest point in the
linkages between students' performances that, altogether, constitute the
entry-level accomplishments that a course of instruction might be
designed to build upon. You then force the weakest of the two clusters
to split, now forming three clusters, and then force the weakest of these
to split to form four clusters, and so on. Looking at things in this
way, one is asking a question that is very basic to the design of
instructional coursework:

     Within a population of students, are all students about the
     same, or are there different subgroups of students who are
     quite different in terms of past school accomplishments as
     these accomplishments are now represented by different levels
     of performance on different skill areas of mathematics?
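The nested solutions described above (one cluster, then two, then three, and so on) can be illustrated with a small sketch of Ward's agglomerative method, the method named in the heading of Table 2. This is not the CLUSTER procedure used for the report; it is a minimal pure-Python version written for illustration, and the six student profiles are hypothetical values, not survey data. The sketch builds the solutions bottom-up by repeatedly merging the two clusters whose union adds the least within-cluster variation; reading the stored solutions from k = 1 upward gives the "forced split" view.

```python
from itertools import combinations

def centroid(rows):
    """Mean profile (one value per skill area) of a list of score rows."""
    n = len(rows)
    return [sum(r[d] for r in rows) / n for d in range(len(rows[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward_solutions(profiles):
    """Map k -> list of clusters (sorted student indices), for k = n down to 1."""
    clusters = [[i] for i in range(len(profiles))]
    solutions = {len(clusters): [c[:] for c in clusters]}
    while len(clusters) > 1:
        # Merge the pair whose union adds the least within-cluster variation.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [profiles[r] for r in clusters[ij[0]]],
                [profiles[r] for r in clusters[ij[1]]],
            ),
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        solutions[len(clusters)] = sorted(clusters)
    return solutions

# Hypothetical proportion-correct scores: six students, two skill areas.
profiles = [
    [0.90, 0.90], [0.95, 0.85],   # strong on both skill areas
    [0.60, 0.30], [0.65, 0.25],   # mixed
    [0.10, 0.10], [0.15, 0.05],   # weak on both skill areas
]
solutions = ward_solutions(profiles)
for k in (1, 2, 3):
    print(k, solutions[k])
```

With these made-up profiles, the 2-cluster solution separates the two strong students from the other four, and the 3-cluster solution then splits the group of four, which mirrors the way one cluster stays intact while another splits in Table 1.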

Table 2 provides information about different clusterings that helps
to sort out various strengths and weaknesses within this group of 25
students. First, there are the simple statistics, which show mean
performance levels for the entire sample of 25 students on each of the
nine skill areas. The sample as a whole had a relatively high performance
level on computation with whole numbers (COMPWHLE) where, on the
average, they answered about 83% of the items correctly. On the other
hand, the performance level on decimals was quite low. Here, students
answered, on the average, about 37% of the items correctly. Mainly, the
problems in this skill area required students to give equivalent decimals
and percents for common fractions, such as 3/20, or mixed numbers, such
as 5 3/5. The bottom of Table 2 shows the same kind of statistic, mean
performance levels on each skill area, for each of the clusters in
"1-cluster," "2-cluster," "3-cluster," and "4-cluster" solutions. The
means for the 1-cluster solution are the same as the simple statistics
for the entire sample that are shown at the top of the table. For the
other solutions, these cluster means can show us where we are dealing
with several subgroups rather than a single group.
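Because every student belongs to exactly one cluster in each solution, the cluster means in any row block of Table 2 must combine, weighted by cluster size, into the 1-cluster means. The following sketch checks this for the 2-cluster solution using the values printed in Table 2. The function name `pooled_mean` is introduced here for illustration only, and WHOLENO is left out because its entries are damaged in the scanned table.

```python
# Cluster sizes and cluster means from the 2-cluster solution in Table 2.
sizes = [8, 17]
two_cluster_means = {
    "FRACTION": [0.39583, 0.82353],
    "DECIMALS": [0.250000, 0.423529],
    "COMPWHLE": [0.60714, 0.93277],
    "COMPFRAC": [0.250000, 0.689076],
    "COMPDECI": [0.359375, 0.617647],
    "COMPPERC": [0.32500, 0.77647],
    "MEASURE":  [0.20000, 0.78824],
    "PROBLEMS": [0.40000, 0.91765],
}

# Means for the whole sample of 25 (the 1-cluster row of Table 2).
overall = {
    "FRACTION": 0.68667, "DECIMALS": 0.368000, "COMPWHLE": 0.82857,
    "COMPFRAC": 0.548571, "COMPDECI": 0.535000, "COMPPERC": 0.63200,
    "MEASURE": 0.60000, "PROBLEMS": 0.75200,
}

def pooled_mean(sizes, means):
    """Size-weighted average of cluster means."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

for skill, means in two_cluster_means.items():
    print(f"{skill:9s} pooled={pooled_mean(sizes, means):.5f} "
          f"overall={overall[skill]:.5f}")
```

The same check is a convenient way to catch transcription errors when a table of cluster means is copied by hand.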

The 2-cluster solution in Table 2 shows us that our sample of 25
students is composed of at least two subgroups. One subgroup with 17
students in it has performance levels on most skill areas that are
30 or more percentage points higher than those of the remaining cluster
of eight students. Only on DECIMALS and COMPDEC (computation with
decimals) was the spread between the two clusters less than 30 percentage
points. In fact, except for DECIMALS, the differences between these two
clusters are so great that it would be foolish to try to design a single
course in mathematics that could possibly build on the past
accomplishments of these two groups of students; they are simply too
different. On the other hand, would two courses take care of things, or
are other differences still buried in either of these two clusters? The
answers are no and yes: no, two courses wouldn't be sufficient because,
yes, there are still subgroups that are quite different buried within the
larger cluster of 17 students.

Consider the 3-cluster solution, where the cluster of 17 students
splits into two smaller clusters of seven and ten students respectively.
Most of the "spreads" in cluster means that resulted from splitting the
cluster of 17 students into two smaller ones are on the order of 20
percentage points or less, but not all. There are huge differences between
these two new clusters when it comes to performances on whole numbers
(WHOLENO), 88% compared to 57%; decimals (DECIMALS), 54% compared to 26%;
and computation with fractions (COMPFRAC), 86% compared to 45%. Were it
not for such large differences on these three skill areas, it is
conceivable that a common course in mathematics could build adequately on
the profile of accomplishments represented by these two clusters, even if
students from the two clusters were taught in the same course. An
instructor would have to take some precautions to be sure that the cluster
of seven students received a little extra review on topics related to most
of the skill areas, but that should not be an insurmountable difficulty.
As it is, the differences between these two clusters on whole numbers,
decimals, and especially computation with fractions are so large that it
is inconceivable that an instructor could, in the same course, build
adequately on past accomplishments of the 17 students taken as a single
group.
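The "spreads" discussed above are simply differences between cluster means, expressed in percentage points. A minimal sketch of that arithmetic follows, using the means of the ten-student and seven-student clusters from the 3-cluster solution in Table 2 (the WHOLENO value of .88 for the ten-student cluster is taken from the 88% quoted in the text). The 25-point cutoff used to flag a "large" spread is an assumption made here for illustration; the report does not state a formal cutoff.

```python
# Cluster means from the 3-cluster solution (Table 2).
cluster_of_10 = {
    "WHOLENO": 0.88, "FRACTION": 0.90000, "DECIMALS": 0.540000,
    "COMPWHLE": 0.97143, "COMPFRAC": 0.857143, "COMPDECI": 0.712500,
    "COMPPERC": 0.82000, "MEASURE": 0.86000, "PROBLEMS": 0.94000,
}
cluster_of_7 = {
    "WHOLENO": 0.57143, "FRACTION": 0.71429, "DECIMALS": 0.257143,
    "COMPWHLE": 0.87755, "COMPFRAC": 0.448980, "COMPDECI": 0.482143,
    "COMPPERC": 0.71429, "MEASURE": 0.68571, "PROBLEMS": 0.88571,
}

# Spread in percentage points for each skill area.
spreads = {
    skill: 100 * (cluster_of_10[skill] - cluster_of_7[skill])
    for skill in cluster_of_10
}

LARGE = 25  # illustrative cutoff in percentage points (an assumption)
large_spreads = [skill for skill, d in spreads.items() if d > LARGE]

for skill, d in spreads.items():
    flag = "  <-- large" if skill in large_spreads else ""
    print(f"{skill:9s} {d:5.1f}{flag}")
```

Under this cutoff the flagged skill areas are WHOLENO, DECIMALS, and COMPFRAC, the same three singled out in the paragraph above.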

The 4-cluster solution shows that additional differences also exist
within the cluster of eight students who demonstrated relatively low
levels of performance in our 2-cluster solution. When this cluster
breaks into two smaller clusters, each containing four students, there
are fairly large spreads in two skill areas: whole numbers (WHOLENO) and
problem solving (PROBLEMS). However, there are reasons to be cautious at
this point. The spreads in performance levels for the two new clusters
occur on whole numbers (15%), computation with percent (20%), and problem
solving (10%). The tasks in each of these skill areas require considerably
more reading than tasks in the other skill areas, so the new
"performance spreads" we see in the 4-cluster solution could all be tied
to students who have little proficiency for reading English.

The 3-cluster solution for sample 3 contains some patterns among the
twenty-seven cluster means that suggest very clearly the kinds of
proficiencies that the 25 students in this sample do have for instructors
to build upon. First, there is the cluster of ten students who are already
fairly well prepared. Except for numeration with decimals and percent
(DECIMALS) and computation with decimals (COMPDEC), students in this
well-prepared cluster, on the average, were successful more than 80% of
the time on problems in each of the other skill areas. By contrast,
there is the "under-prepared" cluster of eight students who were
successful less than about 40% of the time on problems in all skill areas
except computation with whole numbers (COMPWHLE). In between, there is a
cluster of seven students who were relatively successful on problems
dealing with:

     fractions (FRACTIONS)
     computation with whole numbers (COMPWHLE)
     computation with percent (COMPPERC)
     measurement (MEASURE)
     problem solving (PROBLEMS)

and relatively unsuccessful on problems dealing with:

     numeration with decimals and percent (DECIMALS)
     computation with fractions (COMPFRAC)
     computation with decimals (COMPDEC)

The difference between clusters in the 3-cluster solution is clearest if
one first looks at numeration for decimals and percent (DECIMALS) and
computation with whole numbers (COMPWHLE). On these two skill areas, the
relationships that exist between clusters are fairly obvious. Keep in
mind that numeration for decimals and percent comes relatively late in
the traditional mathematics textbook series intended for study through
grade 8, while computation with whole numbers has been around since about
grade 3. In other words, the students who took Prealgebra B had
undoubtedly had less opportunity to learn numeration with decimals and
percents and more opportunity to learn computation with whole numbers than
any other topics covered in the survey.

Note that in the 3-cluster solution, fewer than about 50% of the
students in all three clusters were successful, on the average, on
problems dealing with numeration for decimals and percents (DECIMALS),
while more than 50% of the students in all three clusters were successful,
on the average, on problems dealing with computation involving whole
numbers (COMPWHLE). Using 50% as a kind of watershed for looking at the
various cluster means, we get a pattern of "pluses" and "minuses" as
shown in Table 3, where "+" represents cluster means that are above 50%
and "-" represents cluster means that are 50% or lower. Obviously,
instructors who plan to teach a general mathematics course that either
deals with or applies the arithmetic of fractions and decimals would have
a fairly extensive background of residual skills to build upon if the
course were taught to students in CLUSTER Y and almost no residual skills
to work with if they were trying to offer the course to students in
CLUSTER X.
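The watershed rule described above is easy to state precisely: a cluster mean above 50% is marked "+", and a mean at or below 50% is marked "-". The sketch below applies that rule to the 3-cluster means printed in Table 2. The clusters are identified by their sizes rather than by letter, since the mapping of letters to clusters is an assumption this sketch does not need, and WHOLENO is omitted because its entries are damaged in the scanned table.

```python
SKILLS = ["FRACTION", "DECIMALS", "COMPWHLE", "COMPFRAC",
          "COMPDECI", "COMPPERC", "MEASURE", "PROBLEMS"]

# Cluster means from the 3-cluster solution in Table 2, in SKILLS order.
three_cluster_means = {
    "cluster of 10": [0.90000, 0.540000, 0.97143, 0.857143,
                      0.712500, 0.82000, 0.86000, 0.94000],
    "cluster of 7":  [0.71429, 0.257143, 0.87755, 0.448980,
                      0.482143, 0.71429, 0.68571, 0.88571],
    "cluster of 8":  [0.39583, 0.250000, 0.60714, 0.250000,
                      0.359375, 0.32500, 0.20000, 0.40000],
}

def watershed(mean, cutoff=0.50):
    """'+' if the mean is above the cutoff, '-' if it is at or below it."""
    return "+" if mean > cutoff else "-"

pattern = {
    name: "".join(watershed(m) for m in means)
    for name, means in three_cluster_means.items()
}

print(" " * 15 + " ".join(s[:4] for s in SKILLS))
for name, signs in pattern.items():
    print(f"{name:15s}" + "    ".join(signs))
```

The three sign strings differ in the way the text describes: the ten-student cluster is above the watershed throughout, the eight-student cluster is above it only on COMPWHLE, and the seven-student cluster falls below it on DECIMALS, COMPFRAC, and COMPDECI.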

What is even more compelling about the pattern of pluses and minuses
in Table 3 is that it is repeated, almost identically, for each of the
other nine samples of about 25 students selected at random from the
population of students taking Prealgebra B. These patterns are shown for
all ten samples in Table 4. From this table, it is clear that the three
clusters of students we identified in sample 3 are different from each
other in almost exactly the same way in every one of the ten samples we
analyzed.

Table 3
Prealgebra B

Cluster Means Above 50% (+) and at or Below 50% (-)
in a 3-Cluster Solution (Sample 3)

[Table body: one row each for Cluster X, Cluster Y, and Cluster Z; one column each for WHOLENO, FRACTION, DECIMALS, COMPWHLE, COMPFRAC, COMPDEC, COMPPERC, MEASURE, and PROBLEMS. The individual + and - entries are not legible in the source scan.]

The pluses and minuses for our three clusters represent different
patterns of proficiencies that students who are channeled into remedial
coursework bring with them for instructors to work with. They are
present in our population of remedial students no matter how many times
we chose a cross section of this population for our analysis. Moreover,
these three cluster types were not originally part of a large
subpopulation of students that existed at only one institution in the
California Youth Authority. In other words, the various students who made
up the "well-prepared" cluster, Cluster B, in each of our samples (there
were close to 100 of them altogether) came from many different classrooms
in many different institutions. They were not "good" students who all
came from the same school. This is important, because we need some
assurance that, when we combined results for all students from different
institutions to form a population and drew out random samples of about 25
students who had been in instruction for two months or less, we were not
merely dividing up an intact group of "good" students across our various
samples.

Table 4
Prealgebra B

Cluster Means Above 50% (+) and at or Below 50% (-) in a 3-Cluster
Solution Across All Samples

[Table body: + and - patterns for each of the ten samples. The entries are not legible in the source scan.]

Analysis of Prealgebra A

Some results of cluster analysis on Prealgebra A are shown in
Table 5. These are cluster means for 2-cluster, 3-cluster, and 4-cluster
solutions for the 22 students who made up sample 3. As you can see, the
structure of this sample is quite different from structures we saw within
the samples of students who took Prealgebra B. For one thing, little is
gained by splitting this sample into more than two clusters, which
consist of 15 students and 7 students respectively. In the 3-cluster
solution, the cluster of seven students splits off a single student and
leaves a new cluster of only six students. The same thing happens when
we look at four clusters; a single student is split from the cluster of
six students. While the performances of this individual student would
most certainly be of concern to his instructor, they don't tell us much
about designing general purpose coursework. We would still need to
design coursework to meet the needs of two different subgroups. All that
would be gained by looking at a 3-cluster or a 4-cluster solution would
be an opportunity to see more clearly the nature of performances in the
smaller cluster by weeding out some results that may represent
aberrations in the mainstream of what individual students in this cluster
bring with them to instruction.

Table 5
Prealgebra A

Cluster Means of Various Solutions by Skill Area
Sample 3

Solution    Cluster  N    NUMERATION  FRACTIONS  ADD BASIC  ADD OPR  MULT BASIC  MULT OPR  MEASURE  GEOMETRY  PROBLEMS

4 clusters  1        15   .89524      .92000     .96667     .90000   .98889      .93333    .82667   .96000    .93333
            2        5    .82857      .48000     .93333     .83333   .83333      .77143    .68000   .72000    .84000
            3        1    .71429      1.00000    .83333     .83333   .83333      .42857    .40000   .60000    1.00000
            4        1    .71429      .80000     1.00000    .66667   .16667      .28571    1.00000  .80000    .80000

3 clusters  1        15   .89524      .92000     .96667     .90000   .98889      .93333    .82667   .96000    .93333
            2        6    .80952      .56667     .91667     .83333   .83333      .71429    .63333   .70000    .86667
            3        1    .71429      .80000     1.00000    .66667   .16667      .28571    1.00000  .80000    .80000

2 clusters  1        15   .89524      .92000     .96667     .90000   .98889      .93333    .82667   .96000    .93333
            2        7    .79592      .60000     .92857     .80952   .73810      .65306    .68571   .71429    .85714

Another thing that is clearly different about Prealgebra A is the
fact that the performances of students in different clusters are a lot
more alike than they were in Prealgebra B. In the 4-cluster solution,
performances in the cluster with 15 students are fairly close to
performances in the cluster with five students except for two or possibly
three skill areas. There is a difference of over 40 percentage points on
FRACTIONS and differences of a little over 20 percentage points on
GEOMETRY and about 15 points on MEASUREMENT. Still, the performances of
students in the smaller cluster are relatively high on geometry and
measurement, especially when compared to performances we observed among
different clusters of students who took Prealgebra B.

Clearly, most students who took Prealgebra A were fairly proficient
on the residual skills that Prealgebra A represents. These are all basic
concepts and basic forms of computation that elementary school textbooks
routinely cover quite thoroughly by the middle of grade [number illegible
in source]. Notice that the well-prepared cluster in sample 3 is by far
the largest one. It is also the largest cluster in each of the other
samples of students, except one, that we looked at from Prealgebra A. The
sizes of the clusters in the 4-cluster solutions for all ten samples of
data from Prealgebra A are shown in Table 6. The well-prepared cluster is
two to three times as large as the other major cluster in all samples
except Sample 2, where the two large clusters are the same size.



Table 6

Prealgebra A

Cluster Sizes of the "Well-prepared" Cluster
Under the Four Cluster Solution

Sample              1     2     3     4     5     1A    2A    3A    4A    5A    Mean

N                   17    10    15    19    18    11    13    18    15    19    15.5

Total Sample Size   26    24    22    29    30    19    24    33    25    29    26.1

% of Sample         65%   42%   68%   66%   60%   58%   54%   55%   60%   66%   59%
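The bottom row of Table 6 follows directly from the two rows above it: each percentage is the size of the well-prepared cluster divided by the total sample size. A short sketch of that arithmetic, using the counts from Table 6, is given below; the variable names are introduced here for illustration.

```python
samples = ["1", "2", "3", "4", "5", "1A", "2A", "3A", "4A", "5A"]
well_prepared_n = [17, 10, 15, 19, 18, 11, 13, 18, 15, 19]
total_n = [26, 24, 22, 29, 30, 19, 24, 33, 25, 29]

# Share of each sample that falls in the well-prepared cluster.
percent = [round(100 * w / t) for w, t in zip(well_prepared_n, total_n)]

# The "Mean" column: averages of the two count rows, then their ratio.
mean_n = sum(well_prepared_n) / len(well_prepared_n)
mean_total = sum(total_n) / len(total_n)
mean_percent = round(100 * mean_n / mean_total)

for s, w, t, p in zip(samples, well_prepared_n, total_n, percent):
    print(f"Sample {s:>2}: {w:2d} of {t:2d} = {p}%")
print(f"Mean: {mean_n} of {mean_total} = {mean_percent}%")
```

Recomputing the row this way is also how a missing or unreadable percentage can be recovered from the counts.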



Looking back at Table 5, we see that there is no truly
"under-prepared" cluster like the one that appeared consistently in all
samples of data from Prealgebra B. Only in the 4-cluster solution do we
see a cluster where students were successful, on the average, less than
50% of the time, and, even then, it only occurred for the skill area that
involved recognition of common fractions. Using 50% as a kind of
watershed, as we did earlier with results from Prealgebra B, we get a
pattern of pluses and minuses for the two main clusters in sample 3 as
shown in Table 7. This pattern is not as consistent as the patterns we
saw for Prealgebra B, which were all almost identical, but, as Table 7
shows, they are very similar. In almost all samples, there was one
cluster where students were relatively unsuccessful on recognition of
single fractions (FRACTIONS). In four out of the ten samples, there was
a cluster where students were not very successful with multiplication and
division beyond basic facts (MULTOPER), but all other major clusters
showed moderate to high levels of success.

Table 7

[Table body: one row each for Cluster X and Cluster Y; one column each for NUMERATION, FRACTION, ADDBASIC, ADDOPER, MULTBASIC, MULTOPER, MEASUREMENT, GEOMETRY, and PROBLEMS. The caption and the individual + and - entries are not legible in the source scan.]

Implications for Course Design

The combined results of cluster analysis of data from
Prealgebra A and Prealgebra B have enormous implications for designing
coursework for students who are typically headed for remediation at the
secondary level. Each of our samples contained a substructure composed
of two or three clusters of students who bring quite different sets of
residual skills for instructors to build upon. More significant is the
fact that each of our samples of students who took the same survey,
Prealgebra A or Prealgebra B, contained clusters that represented the
same relative strengths and weaknesses. Recall that all of the students
in our population were at about the same point in instruction (they were
near the beginning of a course in general mathematics after having been
out of school for at least several months), but they were from seven
different correctional institutions that did not share a common program
of instruction. In fact, each institution had been designated to receive
wards who were in a certain age range or who represented a different need
for supervision and security. The education programs varied widely from
one institution to another. The regularities that we saw among clusters
of students in one sample after another represent a history of past
instruction among students whose backgrounds are highly irregular. The
fact that these inputs to instruction are so regular tells us a lot about
how common are the effects of past instruction in the elementary grades.
Students who have either had more experience or more successful
experiences bring more residual skills than other students to current
coursework, but, more important, their residual skills are all about the
same.

In looking at means for different clusters on different skill areas,
we saw patterns of pluses and minuses, based on whether means were above
or below 50%, that were very regular across all of our samples from
Prealgebra B and almost all samples from Prealgebra A. What is even more
dramatic is the fact that the combination of these means for the
population as a whole yields the same pattern of pluses and minuses, as
shown in Tables 8 and 9.

What the data in these two tables show most clearly is that
differences between clusters represent real differences in the population
based on past instruction. For example, clusters that have means below
50% may vary a great deal between 0% and 50%, but they vary in about the
same way for pluses and minuses that, we would be inclined to believe,
represent about the same cluster of students on the same skill area. In
other words, we don't have a situation where minuses on, say, computation
with decimals represent levels of performance that, based on cluster
means, are lower for our B clusters than they are for our C clusters in
the ten samples from Prealgebra B.

Table 8
Prealgebra B

Weighted Average of Cluster Means Above (+)
and at or Below (-) 50% for Three Major Clusters

Skill area (%)               Cluster X   Cluster Y   Cluster Z

WHOLENO                      87(+)       59(+)       47(-)
FRACTIONS                    91(+)       68(+)       [illegible]
DECIMALS                     57(+)       21(-)       19(-)
COMPWHLE                     97(+)       88(+)       70(+)
COMPFRAC                     86(+)       35(-)       22(-)
COMPDEC                      76(+)       42(-)       35(-)
COMPPERC                     87(+)       67(+)       38(-)
MEASURE *
PROBLEMS                     97(+)       83(+)       37(-)

AVERAGE NUMBER IN CLUSTER    8.5         8.9         7.7

* Only two MEASURE entries, 70(+) and 32(-), are legible in the source; their columns cannot be determined.

Table 9
Prealgebra A

Weighted Average of Cluster Means Above (+)
and at or Below (-) 50% for Two Major Clusters

Skill area (%)               Cluster X   Cluster Y

NUMERATION                   89(+)       70(+)
FRACTIONS                    87(+)       49(-)
ADDBASIC                     97(+)       9?(+)
ADDOPER                      92(+)       85(+)
MULTBASIC                    96(+)       75(+)
MULTOPER                     89(+)       55(+)
MEASURE                      83(+)       7?(+)
GEOMETRY                     91(+)       71(+)
PROBLEMS                     95(+)       78(+)

AVERAGE NUMBER IN CLUSTER    17.5        6.7

In order for new coursework to take maximum advantage of the
patterns of residual skills represented in Tables 8 and 9, it will be
necessary to rethink some issues that are basic to remedial instruction.
As it is now, all of the students represented by these data are caught in
a fairly painful cycle of remediation, and there is little likelihood
that they will ever break loose. Certainly, another course in general
mathematics, whatever it's called, will make little difference in what
these students will be able to do once the course is finished. A course
intended to force-march students to mastery on all possible types of
mathematics problems that might reasonably be classified under each of
the skill areas in Prealgebra B will not achieve much real success, even
among fairly well-prepared students in the A cluster. What will be
achieved instead is a series of increments in what students will be able
to do, and most of these will be related to computation with whole
numbers and, perhaps, computation with decimals and percent. What will
not be achieved is any kind of closure on the basic skills that lie at
the core of each skill area.



It would be more productive to base the design of new coursework on
two considerations:

     1. What kind of course can build most effectively on the
        residual skills that students bring with them from
        elementary school, especially given the fact that time is
        limited to one or two semesters (or one to four quarters)?

     2. How can coursework be redesigned so as to break the cycle
        of remediation for the fairly well-prepared cluster of
        students that we know exists?

What these two considerations do for the redesign of mathematics
coursework prior to a first course in algebra is, more than anything
else, to shift first priorities for instruction away from mastery of
complicated skills for doing accurate and precise computation with large
whole numbers, fractions, and decimals. Such skills are important, but
data from national surveys of how mathematical skills develop within the
population of school age children and adolescents make one fact
abundantly clear: 20 to 30% of the population do not master the full set
of computation skills on whole numbers, common fractions, and decimals, no
matter how many times they are recycled through remedial coursework. To
make matters worse, most remedial coursework is sequenced so that
redevelopment of computation skills comes first, as a prerequisite to
problem solving and other, more formal kinds of applied mathematics, such
as accounting and personal or business economics. The effect is to
guarantee two things: first, most students will not complete, much less
get past, redevelopment of computation skills in the remediation cycle,
and second, they still won't have reliable use of any of the operations on
whole numbers, fractions, and decimals, except for addition and
subtraction of whole numbers and money values. Their success in the
remediation cycle will be limited to tacking bits and pieces onto parts of
computation skills they already have, which means they will still have no
real power to handle numbers with any degree of confidence or
reliability.

What is more important, given the clusterings of students and their
residual skills that we have seen here, is to redesign coursework around
options: options that require reliable use of arithmetic operations,
including such long-neglected skills as approximation and estimation, but
do not depend upon highly accurate skills for doing precise computation.
For example, our analysis of Prealgebra B suggests three different kinds
of course options for three different clusters of students. Students in
our well-prepared cluster, Cluster X, need a course that briefly reviews
the relationship between common fractions, decimals, and percents,
including the equivalence of different ways to express the same quantity.
Beyond this review, the course should focus on more advanced topics that
require use and interpretation of numbers without requiring much by-hand
calculation. The beginning of a specially designed first course in
algebra should not be excluded as an option. The fact that "advanced"
coursework in general mathematics doesn't exist now should not be taken
as evidence that it can't. More likely, it shows the lack of any real
challenge to instructors and to course developers because redevelopment
of computation skills provided not only the focus of general mathematics
but, in practice, also defined its boundaries.

Students in CLUSTER Y need a course that covers many of the same
topics that are covered now in the middle half of a general mathematics
textbook. They may need a brief review of computation with whole
numbers, but redevelopment of this skill is not necessary. They do need
to redevelop computation skills with fractions and decimals. Instruction
in this area should focus first on things like rounding and estimation,
before extensive work on basic forms of computation, so that students in
this cluster get a maximum amount of power to handle applications that
require computation, especially applications that involve percent.
Contrary to many of our assumptions about the inability of remedial
students to solve word problems, the indications from cluster analysis
are that students in CLUSTER Y are, in fact, fairly proficient in
handling basic word problems. What this means is that a course in
problem solving that involves general mathematics already has at least a
modest number of residual skills to build upon; it doesn't have to start
from scratch.

Students in the under-prepared cluster, CLUSTER Z, show only limited
evidence of the residual skills covered in Prealgebra B. They can
obviously do some forms of computation with whole numbers, probably
addition and subtraction, but little else is in place. Students who have
scores in the different skill areas that look about like the ones in the
profile in Table 8 should take Prealgebra A so that instructors can get a
more complete profile of the residual skills that they actually have to
build upon.

The course options needed for students who took Prealgebra A are a
little different. About two-thirds of the students are in a
well-prepared cluster, CLUSTER X, and they should be given Prealgebra B
to get a better idea of the extent of their residual skills. Otherwise,
there would be little alternative but to give intensive practice on
computation with whole numbers for the purpose of being able to compute
faster and with greater reliability. Instructors should look carefully at
performances of students in CLUSTER Y from Prealgebra A. They should pay
special attention to individual problems where individual students were
unsuccessful. There is a good chance that many of these students have
limited proficiency with English, which would mean that they could not be
very successful on any skill areas that included much besides computation
problems. There is also a good chance that some of the students in
CLUSTER Y did not complete all of the sections of Prealgebra A, perhaps
because they ran out of time. However, what the results of the analysis
show most clearly is that students in CLUSTER Y need a thorough review of
tasks that are typically part of "recognizing" common fractions and
mixed numbers and a redevelopment of basic tasks involved in
multiplication and division by a 1-digit number. When instruction in
these two areas is completed, they should be prepared to begin about the
same course as students in CLUSTER Y, Prealgebra B.
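
The recommendations above amount to a placement rule keyed on the survey
a student took and the cluster the student fell into. The Python sketch
below is illustrative only and is not part of the original analysis. The
names PLACEMENT, NO_RULE, and recommend are hypothetical, and the
recommendation strings are paraphrases of the text. Only the four
survey/cluster pairs discussed in this passage are included; any other
pair returns a placeholder.

```python
# Hypothetical placement lookup paraphrasing the recommendations in the text.
# Keys are (survey, cluster) pairs; values are short summaries, not quotations.
PLACEMENT = {
    ("Prealgebra B", "Y"): (
        "Brief whole-number review; redevelop fractions and decimals, "
        "starting with rounding and estimation; build on word-problem skills."
    ),
    ("Prealgebra B", "Z"): (
        "Administer Prealgebra A to get a more complete profile of "
        "residual skills."
    ),
    ("Prealgebra A", "X"): (
        "Administer Prealgebra B to gauge the extent of residual skills."
    ),
    ("Prealgebra A", "Y"): (
        "Review recognizing common fractions and mixed numbers; redevelop "
        "multiplication and division by a 1-digit number; then begin the "
        "course for Prealgebra B CLUSTER Y."
    ),
}

# Returned when the passage gives no recommendation for the pair.
NO_RULE = "No recommendation stated in this passage."


def recommend(survey: str, cluster: str) -> str:
    """Return the paraphrased recommendation for a survey/cluster pair."""
    key = (survey.strip(), cluster.strip().upper())
    return PLACEMENT.get(key, NO_RULE)


if __name__ == "__main__":
    for survey, cluster in PLACEMENT:
        print(f"{survey}, CLUSTER {cluster}: {recommend(survey, cluster)}")
```

A lookup table is used here rather than nested conditionals so that each
recommendation can be checked against the text one pair at a time.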

In summary, using cluster analysis we have shown a need for two
basic course clusters in general mathematics: one that includes at least
one semester or two quarters of redevelopment work on computation with
fractions and decimals; and a second dealing with more advanced topics in
the applications of general mathematics or an introduction to algebra
where the requirements for complicated forms of computation are carefully
controlled, at least at first. A lower-level course that begins as far
back as multiplication and division facts doesn't seem to have much
potential, although a small number of short well-organized modules
dealing with single topics may be very useful for quickly preparing
students to do some productive course work with fractions and decimals.
General mathematics should be one course that redevelops basic concepts
and skills for handling fractions and decimals and another course that
uses fairly adequate proficiencies for handling fractions and decimals to
learn how to do something else. Based on what we've seen in our
analyses, general mathematics should not become mired in mastery of
pre-requisite skills that mostly involve whole numbers, but that is what
it does now in too many courses for too many students.






